{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "a8231e26-6759-406d-aa95-17e5f48d68a2",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-warning\">\n",
    "\n",
    "In order to use Llama2, you need access from Meta. We suggest following the steps in this [article](https://medium.com/@yashambekar1804/getting-started-with-llma2-accessing-llama2-70b-model-and-obtaining-hugging-face-api-token-and-8773e19bda26). "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e9fb5d47-d6b9-4cdd-b2b4-80dda4eef996",
   "metadata": {},
   "source": [
    "# HR Chatbot using Retrieval Augmented Generation\n",
    "\n",
    "Many companies aim to make HR documents more accessible to employees and contractors. A recent promising approach is building a RAG system on top of HR documents. This is because it\n",
    "- provides a unique response to any query\n",
    "- reduces hallucinations as the answer has to be grounded in the retrieved context\n",
    "- makes the process highly scalable due to automatic question-answering\n",
    "\n",
    "However, this solution possesses its unique challenges. For example\n",
    "- keeping the knowledgebase consistent and up-to-date\n",
    "- run LLMs efficiently\n",
    "- ensuring the generated results are correct and aligned with company communication guidelines\n",
    "- etc.\n",
    "\n",
    "In our case, we have two HR policy sources for our hypothetical company.\n",
    "- An older HR policy, which contains our maternal leave policy and details on manager responsibilities\n",
    "- A newer policy document that contains updated information about management responsibilities - alongside some other HR policy information\n",
    "\n",
    "A good system will be able to:\n",
    "- provide relevant information on maternity leave (only covered in the old documents)\n",
    "- synthesize contradicting information and only present to us the correct documents\n",
    "\n",
    "Regarding synthesizing information, there are contradictions between the two documents regarding management responsibilities. We are going to use the creation date as a proxy to know for two similar documents with \n",
    "- very similar information, but\n",
    "- different wording, and\n",
    "- slightly altered in one piece of important information\n",
    "which one is more relevant to our query."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e296da7d-1bce-4bd4-8edf-ff0ce00addde",
   "metadata": {},
   "source": [
    "## Boilerplate\n",
    "\n",
    "### Installation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "0176e0bf-0e5c-46cb-8483-401cb7055039",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%pip install superlinked==3.38.0\n",
    "%pip install -U jupyter ipywidgets\n",
    "%pip install -U torch transformers==4.37 accelerate"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "033c537c-51df-4518-bdf9-851d6c95fec0",
   "metadata": {},
   "source": [
    "### Imports and constants"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "695ab4be-2a2b-4ea0-bae4-8de17f610ea3",
   "metadata": {},
   "outputs": [],
   "source": [
    "from datetime import timedelta, datetime\n",
    "\n",
    "import altair as alt\n",
    "import pandas as pd\n",
    "import requests\n",
    "import torch\n",
    "from transformers import AutoTokenizer, pipeline\n",
    "\n",
    "from superlinked.framework.common.dag.context import CONTEXT_COMMON, CONTEXT_COMMON_NOW\n",
    "from superlinked.framework.common.dag.period_time import PeriodTime\n",
    "from superlinked.framework.common.schema.schema import schema\n",
    "from superlinked.framework.common.schema.schema_object import String, Timestamp\n",
    "from superlinked.framework.common.schema.id_schema_object import IdField\n",
    "from superlinked.framework.common.parser.dataframe_parser import DataFrameParser\n",
    "from superlinked.framework.dsl.executor.in_memory.in_memory_executor import (\n",
    "    InMemoryExecutor,\n",
    ")\n",
    "from superlinked.framework.dsl.index.index import Index\n",
    "from superlinked.framework.dsl.query.param import Param\n",
    "from superlinked.framework.dsl.query.query import Query\n",
    "from superlinked.framework.dsl.query.result import Result\n",
    "from superlinked.framework.dsl.source.in_memory_source import InMemorySource\n",
    "from superlinked.framework.dsl.space.text_similarity_space import (\n",
    "    TextSimilaritySpace,\n",
    "    chunk,\n",
    ")\n",
    "from superlinked.framework.dsl.space.recency_space import RecencySpace\n",
    "\n",
    "alt.renderers.enable(\"mimetype\")\n",
    "alt.data_transformers.disable_max_rows()\n",
    "pd.set_option(\"display.max_colwidth\", 1000)\n",
    "START_OF_2024_TS = int(datetime(2024, 1, 2).timestamp())\n",
    "EXECUTOR_DATA = {CONTEXT_COMMON: {CONTEXT_COMMON_NOW: START_OF_2024_TS}}\n",
    "TOP_N = 10"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dd9493a0-77b7-494b-8fb5-3b9991237404",
   "metadata": {},
   "source": [
    "## Load data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "875c1ca8-3d50-410a-9c90-395e9ae45831",
   "metadata": {},
   "outputs": [],
   "source": [
    "# the new HR policy document\n",
    "r_new = requests.get(\n",
    "    \"https://storage.googleapis.com/superlinked-notebook-hr-knowledgebase/hr_rag_knowledgebase.txt\"\n",
    ")\n",
    "r_new.encoding = \"utf-8-sig\"\n",
    "text_new = r_new.text.replace(\"\\r\\n\", \"\\n\").split(\"\\n\")\n",
    "r_old = requests.get(\n",
    "    \"https://storage.googleapis.com/superlinked-notebook-hr-knowledgebase/hr_rag_old_text.txt\"\n",
    ")\n",
    "r_old.encoding = \"utf-8-sig\"\n",
    "text_old = r_old.text.replace(\"\\r\\n\", \"\\n\").split(\"\\n\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "3499e2ff-43b6-437c-aee6-9edb640aa2bc",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>index</th>\n",
       "      <th>body</th>\n",
       "      <th>creation_date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>To ensure that employees and contractors meet security requirements, understand their responsibilities, and are suitable for their roles.</td>\n",
       "      <td>1704063600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>This policy applies to all employees of &lt;Company Name&gt;, consultants, contractors and other third-party entities with access to &lt;Company Name&gt; production networks and system resources.</td>\n",
       "      <td>1704063600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>Background verification checks on &lt;Company Name&gt; personnel shall be carried out in accordance with relevant laws, regulations, and shall be proportional to the business requirements, the classification of the information to be accessed, and the perceived risks. Background screening shall include criminal history checks unless prohibited by local statute. All third-parties with technical privileged or administrative access to &lt;Company Name&gt; production systems or networks are subject to a background check or requirement to provide evidence of an acceptable background, based on their level of access and the perceived risk to &lt;Company Name&gt;.</td>\n",
       "      <td>1704063600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3</td>\n",
       "      <td>The skills and competence of employees and contractors shall be assessed by human resources staff and the hiring manager or his or her designees as part of the hiring process. Required skills and competencies shall be listed in job descriptions and requisitions, and/or aligned with the responsibilities outlined in the Roles and Responsibilities Policy. Competency evaluations may include reference checks, education and certification verifications, technical testing, and interviews.</td>\n",
       "      <td>1704063600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4</td>\n",
       "      <td>All &lt;Company Name&gt; employees will undergo an annual performance review which will include an assessment of job performance, competence in the role, adherence to company policies and code of conduct, and achievement of role-specific objectives.</td>\n",
       "      <td>1704063600</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   index  \\\n",
       "0      0   \n",
       "1      1   \n",
       "2      2   \n",
       "3      3   \n",
       "4      4   \n",
       "\n",
       "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    body  \\\n",
       "0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              To ensure that employees and contractors meet security requirements, understand their responsibilities, and are suitable for their roles.   \n",
       "1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                This policy applies to all employees of <Company Name>, consultants, contractors and other third-party entities with access to <Company Name> production networks and system resources.   \n",
       "2  Background verification checks on <Company Name> personnel shall be carried out in accordance with relevant laws, regulations, and shall be proportional to the business requirements, the classification of the information to be accessed, and the perceived risks. Background screening shall include criminal history checks unless prohibited by local statute. All third-parties with technical privileged or administrative access to <Company Name> production systems or networks are subject to a background check or requirement to provide evidence of an acceptable background, based on their level of access and the perceived risk to <Company Name>.   \n",
       "3                                                                                                                                                                  The skills and competence of employees and contractors shall be assessed by human resources staff and the hiring manager or his or her designees as part of the hiring process. Required skills and competencies shall be listed in job descriptions and requisitions, and/or aligned with the responsibilities outlined in the Roles and Responsibilities Policy. Competency evaluations may include reference checks, education and certification verifications, technical testing, and interviews.   \n",
       "4                                                                                                                                                                                                                                                                                                                                                                                                                    All <Company Name> employees will undergo an annual performance review which will include an assessment of job performance, competence in the role, adherence to company policies and code of conduct, and achievement of role-specific objectives.   \n",
       "\n",
       "   creation_date  \n",
       "0     1704063600  \n",
       "1     1704063600  \n",
       "2     1704063600  \n",
       "3     1704063600  \n",
       "4     1704063600  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "text_df = pd.DataFrame(text_new + text_old, columns=[\"body\"]).reset_index()\n",
    "# add timestamps to diffentiate the two sources\n",
    "text_df[\"creation_date\"] = [int(datetime(2024, 1, 1).timestamp())] * len(text_new) + [\n",
    "    int(datetime(2023, 1, 1).timestamp())\n",
    "] * len(text_old)\n",
    "text_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e7f531cd-2a14-4cc4-a045-2875d4246d80",
   "metadata": {},
   "source": [
    "## Set up Superlinked for Retrieval"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "15c6b58a-9b46-4a72-97bc-94b3fa1c3c92",
   "metadata": {},
   "outputs": [],
   "source": [
    "# typed schema to describe our inputs\n",
    "@schema\n",
    "class ParagraphSchema:\n",
    "    body: String\n",
    "    created_at: Timestamp\n",
    "    id: IdField"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "68df0aca-c63a-4b30-af2f-ad40313c39f9",
   "metadata": {},
   "outputs": [],
   "source": [
    "paragraph = ParagraphSchema()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "09e0fee5-9206-4a92-95df-aacbbed32f96",
   "metadata": {},
   "outputs": [],
   "source": [
    "# relevance space will encode our knowledgebase utilizing chunking to control the granularity of information\n",
    "relevance_space = TextSimilaritySpace(\n",
    "    text=chunk(paragraph.body, chunk_size=100, chunk_overlap=20),\n",
    "    model=\"sentence-transformers/all-mpnet-base-v2\",\n",
    ")\n",
    "# recency has a periodtime to differentiate between the document created at the beginning of this year and last year\n",
    "recency_space = RecencySpace(\n",
    "    timestamp=paragraph.created_at,\n",
    "    period_time_list=[PeriodTime(timedelta(days=300))],\n",
    "    negative_filter=-0.25,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "38b58202-7f4a-4dde-82df-0bacee012945",
   "metadata": {},
   "outputs": [],
   "source": [
    "paragraph_index = Index([relevance_space, recency_space])\n",
    "paragraph_parser = DataFrameParser(\n",
    "    paragraph, mapping={paragraph.id: \"index\", paragraph.created_at: \"creation_date\"}\n",
    ")\n",
    "source: InMemorySource = InMemorySource(paragraph, parser=paragraph_parser)\n",
    "executor = InMemoryExecutor(\n",
    "    sources=[source], indices=[paragraph_index], context_data=EXECUTOR_DATA\n",
    ")\n",
    "app = executor.run()\n",
    "source.put([text_df])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "bfe88caf-e5ec-4ce7-8665-501b95fa8b02",
   "metadata": {},
   "outputs": [],
   "source": [
    "# our simple query will make a search term possible, and gives us the opportunity to weight input aspects (relevance and recency against each other)\n",
    "knowledgebase_query = (\n",
    "    Query(\n",
    "        paragraph_index,\n",
    "        weights={\n",
    "            relevance_space: Param(\"relevance_weight\"),\n",
    "            recency_space: Param(\"recency_weight\"),\n",
    "        },\n",
    "    )\n",
    "    .find(paragraph)\n",
    "    .similar(relevance_space.text, Param(\"search_query\"))\n",
    "    .limit(TOP_N)\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "492bba0f-a341-4052-82d8-8bf69467511b",
   "metadata": {},
   "source": [
    "## Perform retrieval"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "0f6fc37f-1732-4e3d-b2f1-2734d2e8f19e",
   "metadata": {},
   "outputs": [],
   "source": [
    "def present_result(result: Result):\n",
    "    \"\"\"A small helper function to present our query results\"\"\"\n",
    "    df = result.to_pandas()\n",
    "    df[\"year_created\"] = [datetime.fromtimestamp(ts).year for ts in df[\"created_at\"]]\n",
    "    return df.drop(\"created_at\", axis=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "d6e38954-73da-4c3f-9545-3dc3451c7d23",
   "metadata": {},
   "outputs": [],
   "source": [
    "initial_query_text: str = \"What should management monitor?\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "85ae6a7d-d4b2-418d-9861-d4300dc1492c",
   "metadata": {},
   "source": [
    "First, let's do a simple retrieval based only on text similarity. We can see that the recency weight is set to 0, so the document creation dates are not affecting our results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "a2790c0c-b290-4ac5-8b88-c3466051d719",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>body</th>\n",
       "      <th>id</th>\n",
       "      <th>year_created</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Management shall be responsible for ensuring that information security policies and procedures are reviewed annually, distributed and available, and that employees and contractors abide by those policies and procedures for the duration of their employment or engagement. Annual policy review shall include a review of any linked or referenced procedures, standards or guidelines.</td>\n",
       "      <td>6</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Management is tasked with ensuring that information security policies and procedures undergo an bi-annual review, are disseminated and accessible, and that employees and contractors adhere to them throughout their tenure or contract period. The yearly policy assessment must encompass a scrutiny of any interconnected or referenced protocols, standards, or directives.</td>\n",
       "      <td>16</td>\n",
       "      <td>2023</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Management shall ensure that information security responsibilities are communicated to individuals, through written job descriptions, policies or some other documented method which is accurately updated and maintained. Compliance with information security policies and procedures and fulfillment of information security responsibilities shall be evaluated as part of the performance review process wherever applicable.</td>\n",
       "      <td>7</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Management must guarantee that information security obligations are conveyed to individuals, either through written job outlines, policies, or alternative documented means that are regularly updated and managed accurately. Compliance with information security protocols and the fulfillment of related duties should form part of the performance evaluation process in all cases.</td>\n",
       "      <td>18</td>\n",
       "      <td>2023</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Management should factor in undue pressures and potential fraud scenarios when devising incentives and delineating roles, duties, and authorizations.</td>\n",
       "      <td>17</td>\n",
       "      <td>2023</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Management shall consider excessive pressures, and opportunities for fraud when establishing incentives and segregating roles, responsibilities, and authorities.</td>\n",
       "      <td>8</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>To ensure that employees and contractors meet security requirements, understand their responsibilities, and are suitable for their roles.</td>\n",
       "      <td>0</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>All &lt;Company Name&gt; employees will undergo an annual performance review which will include an assessment of job performance, competence in the role, adherence to company policies and code of conduct, and achievement of role-specific objectives.</td>\n",
       "      <td>4</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Information security leaders and managers shall ensure appropriate professional development occurs to provide an understanding of current threats and trends in the security landscape. Security leaders and key stakeholders shall attend trainings, obtain and maintain relevant certifications, and maintain memberships in industry groups as appropriate.</td>\n",
       "      <td>11</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>All &lt;Company Name&gt; employees and third-parties with administrative or privileged technical access to &lt;Company Name&gt; production systems and networks shall complete security awareness training at the time of hire and annually thereafter. Management shall monitor training completion and shall take appropriate steps to ensure compliance with this policy. Employees and contractors shall be aware of relevant information security and data privacy policies and procedures. The company shall ensure that personnel receive security and data privacy training appropriate to their role and data handling responsibilities.</td>\n",
       "      <td>9</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    body  \\\n",
       "0                                                                                                                                                                                                                                            Management shall be responsible for ensuring that information security policies and procedures are reviewed annually, distributed and available, and that employees and contractors abide by those policies and procedures for the duration of their employment or engagement. Annual policy review shall include a review of any linked or referenced procedures, standards or guidelines.   \n",
       "1                                                                                                                                                                                                                                                       Management is tasked with ensuring that information security policies and procedures undergo an bi-annual review, are disseminated and accessible, and that employees and contractors adhere to them throughout their tenure or contract period. The yearly policy assessment must encompass a scrutiny of any interconnected or referenced protocols, standards, or directives.   \n",
       "2                                                                                                                                                                                                     Management shall ensure that information security responsibilities are communicated to individuals, through written job descriptions, policies or some other documented method which is accurately updated and maintained. Compliance with information security policies and procedures and fulfillment of information security responsibilities shall be evaluated as part of the performance review process wherever applicable.   \n",
       "3                                                                                                                                                                                                                                               Management must guarantee that information security obligations are conveyed to individuals, either through written job outlines, policies, or alternative documented means that are regularly updated and managed accurately. Compliance with information security protocols and the fulfillment of related duties should form part of the performance evaluation process in all cases.   \n",
       "4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  Management should factor in undue pressures and potential fraud scenarios when devising incentives and delineating roles, duties, and authorizations.   \n",
       "5                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Management shall consider excessive pressures, and opportunities for fraud when establishing incentives and segregating roles, responsibilities, and authorities.   \n",
       "6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              To ensure that employees and contractors meet security requirements, understand their responsibilities, and are suitable for their roles.   \n",
       "7                                                                                                                                                                                                                                                                                                                                                                                    All <Company Name> employees will undergo an annual performance review which will include an assessment of job performance, competence in the role, adherence to company policies and code of conduct, and achievement of role-specific objectives.   \n",
       "8                                                                                                                                                                                                                                                                         Information security leaders and managers shall ensure appropriate professional development occurs to provide an understanding of current threats and trends in the security landscape. Security leaders and key stakeholders shall attend trainings, obtain and maintain relevant certifications, and maintain memberships in industry groups as appropriate.   \n",
       "9  All <Company Name> employees and third-parties with administrative or privileged technical access to <Company Name> production systems and networks shall complete security awareness training at the time of hire and annually thereafter. Management shall monitor training completion and shall take appropriate steps to ensure compliance with this policy. Employees and contractors shall be aware of relevant information security and data privacy policies and procedures. The company shall ensure that personnel receive security and data privacy training appropriate to their role and data handling responsibilities.   \n",
       "\n",
       "   id  year_created  \n",
       "0   6          2024  \n",
       "1  16          2023  \n",
       "2   7          2024  \n",
       "3  18          2023  \n",
       "4  17          2023  \n",
       "5   8          2024  \n",
       "6   0          2024  \n",
       "7   4          2024  \n",
       "8  11          2024  \n",
       "9   9          2024  "
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "only_relevance_result = app.query(\n",
    "    knowledgebase_query,\n",
    "    relevance_weight=1,\n",
    "    recency_weight=0,\n",
    "    search_query=initial_query_text,\n",
    ")\n",
    "\n",
    "present_result(only_relevance_result)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "97b7e9a7-04e7-4c03-b602-2fb70f1acae3",
   "metadata": {},
   "source": [
    "Let's observe the difference between elements with ids 16 and 6, respectively! These documents essentially say the same thing, except that the old document prescribes a bi-annual, while the fresh one an annual review on policies and procedures. Let's see what happens if we upweight recency!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "704f409e-9496-4dc1-9734-33164b505731",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>body</th>\n",
       "      <th>id</th>\n",
       "      <th>year_created</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Management shall be responsible for ensuring that information security policies and procedures are reviewed annually, distributed and available, and that employees and contractors abide by those policies and procedures for the duration of their employment or engagement. Annual policy review shall include a review of any linked or referenced procedures, standards or guidelines.</td>\n",
       "      <td>6</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Management shall ensure that information security responsibilities are communicated to individuals, through written job descriptions, policies or some other documented method which is accurately updated and maintained. Compliance with information security policies and procedures and fulfillment of information security responsibilities shall be evaluated as part of the performance review process wherever applicable.</td>\n",
       "      <td>7</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Management shall consider excessive pressures, and opportunities for fraud when establishing incentives and segregating roles, responsibilities, and authorities.</td>\n",
       "      <td>8</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>To ensure that employees and contractors meet security requirements, understand their responsibilities, and are suitable for their roles.</td>\n",
       "      <td>0</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>All &lt;Company Name&gt; employees will undergo an annual performance review which will include an assessment of job performance, competence in the role, adherence to company policies and code of conduct, and achievement of role-specific objectives.</td>\n",
       "      <td>4</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Information security leaders and managers shall ensure appropriate professional development occurs to provide an understanding of current threats and trends in the security landscape. Security leaders and key stakeholders shall attend trainings, obtain and maintain relevant certifications, and maintain memberships in industry groups as appropriate.</td>\n",
       "      <td>11</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>All &lt;Company Name&gt; employees and third-parties with administrative or privileged technical access to &lt;Company Name&gt; production systems and networks shall complete security awareness training at the time of hire and annually thereafter. Management shall monitor training completion and shall take appropriate steps to ensure compliance with this policy. Employees and contractors shall be aware of relevant information security and data privacy policies and procedures. The company shall ensure that personnel receive security and data privacy training appropriate to their role and data handling responsibilities.</td>\n",
       "      <td>9</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>In order to maintain a robust level of security awareness, the company will provide security-related updates and communications to company personnel on an on-going basis through multiple communication channels as needed.</td>\n",
       "      <td>10</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Management is tasked with ensuring that information security policies and procedures undergo an bi-annual review, are disseminated and accessible, and that employees and contractors adhere to them throughout their tenure or contract period. The yearly policy assessment must encompass a scrutiny of any interconnected or referenced protocols, standards, or directives.</td>\n",
       "      <td>16</td>\n",
       "      <td>2023</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Background verification checks on &lt;Company Name&gt; personnel shall be carried out in accordance with relevant laws, regulations, and shall be proportional to the business requirements, the classification of the information to be accessed, and the perceived risks. Background screening shall include criminal history checks unless prohibited by local statute. All third-parties with technical privileged or administrative access to &lt;Company Name&gt; production systems or networks are subject to a background check or requirement to provide evidence of an acceptable background, based on their level of access and the perceived risk to &lt;Company Name&gt;.</td>\n",
       "      <td>2</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    body  \\\n",
       "0                                                                                                                                                                                                                                                                            Management shall be responsible for ensuring that information security policies and procedures are reviewed annually, distributed and available, and that employees and contractors abide by those policies and procedures for the duration of their employment or engagement. Annual policy review shall include a review of any linked or referenced procedures, standards or guidelines.   \n",
       "1                                                                                                                                                                                                                                     Management shall ensure that information security responsibilities are communicated to individuals, through written job descriptions, policies or some other documented method which is accurately updated and maintained. Compliance with information security policies and procedures and fulfillment of information security responsibilities shall be evaluated as part of the performance review process wherever applicable.   \n",
       "2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Management shall consider excessive pressures, and opportunities for fraud when establishing incentives and segregating roles, responsibilities, and authorities.   \n",
       "3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              To ensure that employees and contractors meet security requirements, understand their responsibilities, and are suitable for their roles.   \n",
       "4                                                                                                                                                                                                                                                                                                                                                                                                                    All <Company Name> employees will undergo an annual performance review which will include an assessment of job performance, competence in the role, adherence to company policies and code of conduct, and achievement of role-specific objectives.   \n",
       "5                                                                                                                                                                                                                                                                                                         Information security leaders and managers shall ensure appropriate professional development occurs to provide an understanding of current threats and trends in the security landscape. Security leaders and key stakeholders shall attend trainings, obtain and maintain relevant certifications, and maintain memberships in industry groups as appropriate.   \n",
       "6                                  All <Company Name> employees and third-parties with administrative or privileged technical access to <Company Name> production systems and networks shall complete security awareness training at the time of hire and annually thereafter. Management shall monitor training completion and shall take appropriate steps to ensure compliance with this policy. Employees and contractors shall be aware of relevant information security and data privacy policies and procedures. The company shall ensure that personnel receive security and data privacy training appropriate to their role and data handling responsibilities.   \n",
       "7                                                                                                                                                                                                                                                                                                                                                                                                                                           In order to maintain a robust level of security awareness, the company will provide security-related updates and communications to company personnel on an on-going basis through multiple communication channels as needed.   \n",
       "8                                                                                                                                                                                                                                                                                       Management is tasked with ensuring that information security policies and procedures undergo an bi-annual review, are disseminated and accessible, and that employees and contractors adhere to them throughout their tenure or contract period. The yearly policy assessment must encompass a scrutiny of any interconnected or referenced protocols, standards, or directives.   \n",
       "9  Background verification checks on <Company Name> personnel shall be carried out in accordance with relevant laws, regulations, and shall be proportional to the business requirements, the classification of the information to be accessed, and the perceived risks. Background screening shall include criminal history checks unless prohibited by local statute. All third-parties with technical privileged or administrative access to <Company Name> production systems or networks are subject to a background check or requirement to provide evidence of an acceptable background, based on their level of access and the perceived risk to <Company Name>.   \n",
       "\n",
       "   id  year_created  \n",
       "0   6          2024  \n",
       "1   7          2024  \n",
       "2   8          2024  \n",
       "3   0          2024  \n",
       "4   4          2024  \n",
       "5  11          2024  \n",
       "6   9          2024  \n",
       "7  10          2024  \n",
       "8  16          2023  \n",
       "9   2          2024  "
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mild_recency_result = app.query(\n",
    "    knowledgebase_query,\n",
    "    relevance_weight=1,\n",
    "    recency_weight=0.15,\n",
    "    search_query=initial_query_text,\n",
    ")\n",
    "\n",
    "present_result(mild_recency_result)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "459c594b-c49a-4cdd-af37-bb1ea2221338",
   "metadata": {},
   "source": [
    "Now the relative position of document 6 compared to 16 has improved, as it received a score boost for its recency. We can see the same dynamics between the documents with IDs 7 and 18: doc 7 softened the requirement from \"in all cases\" to \"wherever applicable\". It is a useful feature that documents stating essentially the same thing can be differentiated through their freshness. Let's boost recency further and see if we can keep only recent, not outdated, information in our results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "1b14b33a-36a3-412f-bf7c-5f88262ed84e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>body</th>\n",
       "      <th>id</th>\n",
       "      <th>year_created</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Management shall be responsible for ensuring that information security policies and procedures are reviewed annually, distributed and available, and that employees and contractors abide by those policies and procedures for the duration of their employment or engagement. Annual policy review shall include a review of any linked or referenced procedures, standards or guidelines.</td>\n",
       "      <td>6</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Management shall ensure that information security responsibilities are communicated to individuals, through written job descriptions, policies or some other documented method which is accurately updated and maintained. Compliance with information security policies and procedures and fulfillment of information security responsibilities shall be evaluated as part of the performance review process wherever applicable.</td>\n",
       "      <td>7</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Management shall consider excessive pressures, and opportunities for fraud when establishing incentives and segregating roles, responsibilities, and authorities.</td>\n",
       "      <td>8</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>To ensure that employees and contractors meet security requirements, understand their responsibilities, and are suitable for their roles.</td>\n",
       "      <td>0</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>All &lt;Company Name&gt; employees will undergo an annual performance review which will include an assessment of job performance, competence in the role, adherence to company policies and code of conduct, and achievement of role-specific objectives.</td>\n",
       "      <td>4</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Information security leaders and managers shall ensure appropriate professional development occurs to provide an understanding of current threats and trends in the security landscape. Security leaders and key stakeholders shall attend trainings, obtain and maintain relevant certifications, and maintain memberships in industry groups as appropriate.</td>\n",
       "      <td>11</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>All &lt;Company Name&gt; employees and third-parties with administrative or privileged technical access to &lt;Company Name&gt; production systems and networks shall complete security awareness training at the time of hire and annually thereafter. Management shall monitor training completion and shall take appropriate steps to ensure compliance with this policy. Employees and contractors shall be aware of relevant information security and data privacy policies and procedures. The company shall ensure that personnel receive security and data privacy training appropriate to their role and data handling responsibilities.</td>\n",
       "      <td>9</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>In order to maintain a robust level of security awareness, the company will provide security-related updates and communications to company personnel on an on-going basis through multiple communication channels as needed.</td>\n",
       "      <td>10</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Background verification checks on &lt;Company Name&gt; personnel shall be carried out in accordance with relevant laws, regulations, and shall be proportional to the business requirements, the classification of the information to be accessed, and the perceived risks. Background screening shall include criminal history checks unless prohibited by local statute. All third-parties with technical privileged or administrative access to &lt;Company Name&gt; production systems or networks are subject to a background check or requirement to provide evidence of an acceptable background, based on their level of access and the perceived risk to &lt;Company Name&gt;.</td>\n",
       "      <td>2</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Company policies and information security roles and responsibilities shall be communicated to employees and third-parties at the time of hire or engagement, and employees and contractors are required to formally acknowledge their understanding and acceptance of their security responsibilities. Employees and third-parties with access to company or customer information shall sign an appropriate non-disclosure, confidentiality, and appropriate code-of-conduct agreements. Contractual agreements shall state responsibilities for information security as needed. Employees and relevant third-parties shall follow all &lt;Company Name&gt; information security policies.</td>\n",
       "      <td>5</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   body  \\\n",
       "0                                                                                                                                                                                                                                                                                           Management shall be responsible for ensuring that information security policies and procedures are reviewed annually, distributed and available, and that employees and contractors abide by those policies and procedures for the duration of their employment or engagement. Annual policy review shall include a review of any linked or referenced procedures, standards or guidelines.   \n",
       "1                                                                                                                                                                                                                                                    Management shall ensure that information security responsibilities are communicated to individuals, through written job descriptions, policies or some other documented method which is accurately updated and maintained. Compliance with information security policies and procedures and fulfillment of information security responsibilities shall be evaluated as part of the performance review process wherever applicable.   \n",
       "2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     Management shall consider excessive pressures, and opportunities for fraud when establishing incentives and segregating roles, responsibilities, and authorities.   \n",
       "3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             To ensure that employees and contractors meet security requirements, understand their responsibilities, and are suitable for their roles.   \n",
       "4                                                                                                                                                                                                                                                                                                                                                                                                                                   All <Company Name> employees will undergo an annual performance review which will include an assessment of job performance, competence in the role, adherence to company policies and code of conduct, and achievement of role-specific objectives.   \n",
       "5                                                                                                                                                                                                                                                                                                                        Information security leaders and managers shall ensure appropriate professional development occurs to provide an understanding of current threats and trends in the security landscape. Security leaders and key stakeholders shall attend trainings, obtain and maintain relevant certifications, and maintain memberships in industry groups as appropriate.   \n",
       "6                                                 All <Company Name> employees and third-parties with administrative or privileged technical access to <Company Name> production systems and networks shall complete security awareness training at the time of hire and annually thereafter. Management shall monitor training completion and shall take appropriate steps to ensure compliance with this policy. Employees and contractors shall be aware of relevant information security and data privacy policies and procedures. The company shall ensure that personnel receive security and data privacy training appropriate to their role and data handling responsibilities.   \n",
       "7                                                                                                                                                                                                                                                                                                                                                                                                                                                          In order to maintain a robust level of security awareness, the company will provide security-related updates and communications to company personnel on an on-going basis through multiple communication channels as needed.   \n",
       "8                 Background verification checks on <Company Name> personnel shall be carried out in accordance with relevant laws, regulations, and shall be proportional to the business requirements, the classification of the information to be accessed, and the perceived risks. Background screening shall include criminal history checks unless prohibited by local statute. All third-parties with technical privileged or administrative access to <Company Name> production systems or networks are subject to a background check or requirement to provide evidence of an acceptable background, based on their level of access and the perceived risk to <Company Name>.   \n",
       "9  Company policies and information security roles and responsibilities shall be communicated to employees and third-parties at the time of hire or engagement, and employees and contractors are required to formally acknowledge their understanding and acceptance of their security responsibilities. Employees and third-parties with access to company or customer information shall sign an appropriate non-disclosure, confidentiality, and appropriate code-of-conduct agreements. Contractual agreements shall state responsibilities for information security as needed. Employees and relevant third-parties shall follow all <Company Name> information security policies.   \n",
       "\n",
       "   id  year_created  \n",
       "0   6          2024  \n",
       "1   7          2024  \n",
       "2   8          2024  \n",
       "3   0          2024  \n",
       "4   4          2024  \n",
       "5  11          2024  \n",
       "6   9          2024  \n",
       "7  10          2024  \n",
       "8   2          2024  \n",
       "9   5          2024  "
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "normal_recency_result = app.query(\n",
    "    knowledgebase_query,\n",
    "    relevance_weight=1,\n",
    "    recency_weight=0.25,\n",
    "    search_query=initial_query_text,\n",
    ")\n",
    "\n",
    "norm_recency_result_df = present_result(normal_recency_result)\n",
    "norm_recency_result_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cd4f0cf8-d133-4b36-923e-25554d27bcbb",
   "metadata": {},
   "source": [
    "Yes, we can! All results are now fresh: the outdated counterparts have dropped out of the result set. This way we can supply a nice and clean context to the generation model.\n",
    "\n",
    "It is also important to see that the same weights produce relevant results for queries about maternity leave, a topic only discussed in the older HR document. These older documents are not outranked by recent but irrelevant ones."
   ]
  },
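  {
   "cell_type": "markdown",
   "id": "recency-weight-toy-sketch",
   "metadata": {},
   "source": [
    "To build intuition for both observations above, here is a toy sketch of how a weighted sum of relevance and recency could order documents. It is an illustration only and not how the query above computes its scores: the `rank` helper, the linear scoring rule, and all score values are assumptions made up for this example.\n",
    "\n",
    "```python\n",
    "def rank(docs, relevance_weight, recency_weight):\n",
    "    # docs maps doc id -> (relevance, recency); both scores are assumed to lie in [0, 1].\n",
    "    def score(doc_id):\n",
    "        relevance, recency = docs[doc_id]\n",
    "        return relevance_weight * relevance + recency_weight * recency\n",
    "\n",
    "    return sorted(docs, key=score, reverse=True)\n",
    "\n",
    "\n",
    "# Made-up scores for a management-responsibility query: doc 16 (2023) is a\n",
    "# near-duplicate of doc 6 (2024), while doc 8 (2024) is somewhat less relevant.\n",
    "management_query = {6: (0.82, 1.0), 16: (0.80, 0.0), 8: (0.60, 1.0)}\n",
    "\n",
    "# Made-up scores for a maternity-leave query: only the 2023 doc 22 is on topic.\n",
    "maternity_query = {22: (0.85, 0.0), 6: (0.30, 1.0), 8: (0.25, 1.0)}\n",
    "\n",
    "relevance_only = rank(management_query, relevance_weight=1, recency_weight=0)\n",
    "with_recency = rank(management_query, relevance_weight=1, recency_weight=0.25)\n",
    "maternity_ranked = rank(maternity_query, relevance_weight=1, recency_weight=0.25)\n",
    "\n",
    "print(relevance_only)    # the old near-duplicate 16 sits right behind 6\n",
    "print(with_recency)      # 16 drops below the fresh doc 8\n",
    "print(maternity_ranked)  # the old but relevant doc 22 still ranks first\n",
    "```\n",
    "\n",
    "With these made-up numbers, a modest recency weight is enough to push an outdated near-duplicate below fresher documents, yet it is too small to lift recent off-topic documents above an older one that actually answers the query."
   ]
  },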
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "493ef42f-4d45-4bda-bb5f-af56f27569fd",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>body</th>\n",
       "      <th>id</th>\n",
       "      <th>year_created</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>[Company Name] is committed to supporting employees during significant life events such as the birth or adoption of a child. Our Maternity Leave Policy aims to provide eligible employees with the necessary time off to bond with their newborn or newly adopted child and to ensure a smooth transition back to work.</td>\n",
       "      <td>22</td>\n",
       "      <td>2023</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Following childbirth or adoption, employees are entitled to [XX weeks/months] of paid maternity leave to recover from childbirth, bond with their newborn or adopted child, and adjust to their new family dynamic. This period of leave is intended to promote the physical and emotional well-being of both the parent and child during the crucial early stages of development.</td>\n",
       "      <td>26</td>\n",
       "      <td>2023</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>[Company Name] is committed to maintaining the confidentiality of employees' maternity leave and pregnancy-related information in accordance with applicable privacy laws. Additionally, [Company Name] prohibits discrimination or retaliation against employees for exercising their rights under this policy or applicable law.</td>\n",
       "      <td>30</td>\n",
       "      <td>2023</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Eligible employees are entitled to up to [XX weeks/months] of paid maternity leave, which includes both prenatal and postnatal leave. Additionally, eligible employees may also take up to [XX weeks/months] of unpaid maternity leave following the conclusion of their paid leave, as provided under the Family and Medical Leave Act (FMLA) and applicable state laws.</td>\n",
       "      <td>24</td>\n",
       "      <td>2023</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>[Company Name] recognizes the importance of work-life balance and supports flexible work arrangements, including telecommuting and adjusted work schedules, to facilitate a smooth transition back to work for new parents. Employees returning from maternity leave are encouraged to discuss their individual needs and preferences with their supervisor and HR department to explore available options.</td>\n",
       "      <td>27</td>\n",
       "      <td>2023</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>All full-time and part-time employees are eligible for maternity leave if they have been employed by [Company Name] for at least 12 months and have worked a minimum of 1,250 hours during the 12-month period preceding the commencement of leave. This policy applies equally to biological mothers, adoptive parents, and employees expecting the birth of a child through surrogacy.</td>\n",
       "      <td>23</td>\n",
       "      <td>2023</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>At the conclusion of maternity leave, employees are expected to return to their regular duties and responsibilities. [Company Name] will make reasonable efforts to accommodate the transition back to work, including providing any necessary training or support to help employees reintegrate into the workplace successfully.</td>\n",
       "      <td>29</td>\n",
       "      <td>2023</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Employee and contractor termination and offboarding processes shall ensure that physical and logical access is promptly revoked in accordance with company SLAs and policies, and that all company issued equipment is returned. Any security or confidentiality agreements which remain valid after termination shall be communicated to the employee or contractor at time of termination.</td>\n",
       "      <td>12</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>This policy applies to all employees of &lt;Company Name&gt;, consultants, contractors and other third-party entities with access to &lt;Company Name&gt; production networks and system resources.</td>\n",
       "      <td>1</td>\n",
       "      <td>2024</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>This Maternity Leave Policy will be reviewed periodically by the HR department to ensure compliance with relevant laws and regulations and to assess its effectiveness in supporting the needs of employees and their families.</td>\n",
       "      <td>32</td>\n",
       "      <td>2023</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                                                                                                                                                                                                                                                                                                                                                                          body  \\\n",
       "0                                                                                     [Company Name] is committed to supporting employees during significant life events such as the birth or adoption of a child. Our Maternity Leave Policy aims to provide eligible employees with the necessary time off to bond with their newborn or newly adopted child and to ensure a smooth transition back to work.   \n",
       "1                           Following childbirth or adoption, employees are entitled to [XX weeks/months] of paid maternity leave to recover from childbirth, bond with their newborn or adopted child, and adjust to their new family dynamic. This period of leave is intended to promote the physical and emotional well-being of both the parent and child during the crucial early stages of development.   \n",
       "2                                                                           [Company Name] is committed to maintaining the confidentiality of employees' maternity leave and pregnancy-related information in accordance with applicable privacy laws. Additionally, [Company Name] prohibits discrimination or retaliation against employees for exercising their rights under this policy or applicable law.   \n",
       "3                                    Eligible employees are entitled to up to [XX weeks/months] of paid maternity leave, which includes both prenatal and postnatal leave. Additionally, eligible employees may also take up to [XX weeks/months] of unpaid maternity leave following the conclusion of their paid leave, as provided under the Family and Medical Leave Act (FMLA) and applicable state laws.   \n",
       "4  [Company Name] recognizes the importance of work-life balance and supports flexible work arrangements, including telecommuting and adjusted work schedules, to facilitate a smooth transition back to work for new parents. Employees returning from maternity leave are encouraged to discuss their individual needs and preferences with their supervisor and HR department to explore available options.   \n",
       "5                     All full-time and part-time employees are eligible for maternity leave if they have been employed by [Company Name] for at least 12 months and have worked a minimum of 1,250 hours during the 12-month period preceding the commencement of leave. This policy applies equally to biological mothers, adoptive parents, and employees expecting the birth of a child through surrogacy.   \n",
       "6                                                                            At the conclusion of maternity leave, employees are expected to return to their regular duties and responsibilities. [Company Name] will make reasonable efforts to accommodate the transition back to work, including providing any necessary training or support to help employees reintegrate into the workplace successfully.   \n",
       "7                 Employee and contractor termination and offboarding processes shall ensure that physical and logical access is promptly revoked in accordance with company SLAs and policies, and that all company issued equipment is returned. Any security or confidentiality agreements which remain valid after termination shall be communicated to the employee or contractor at time of termination.   \n",
       "8                                                                                                                                                                                                                      This policy applies to all employees of <Company Name>, consultants, contractors and other third-party entities with access to <Company Name> production networks and system resources.   \n",
       "9                                                                                                                                                                              This Maternity Leave Policy will be reviewed periodically by the HR department to ensure compliance with relevant laws and regulations and to assess its effectiveness in supporting the needs of employees and their families.   \n",
       "\n",
       "   id  year_created  \n",
       "0  22          2023  \n",
       "1  26          2023  \n",
       "2  30          2023  \n",
       "3  24          2023  \n",
       "4  27          2023  \n",
       "5  23          2023  \n",
       "6  29          2023  \n",
       "7  12          2024  \n",
       "8   1          2024  \n",
       "9  32          2023  "
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "normal_recency_result = app.query(\n",
    "    knowledgebase_query,\n",
    "    relevance_weight=1,\n",
    "    recency_weight=0.25,\n",
    "    search_query=\"What are the companies terms for maternal leave?\",\n",
    ")\n",
    "\n",
    "maternity_result = present_result(normal_recency_result)\n",
    "maternity_result"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14cad191-6a81-4456-a0f8-7611a344c2f8",
   "metadata": {},
   "source": [
    "## Augmentation - formulate query for our LLM for generation"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f8df1176-7a04-4680-bdf9-33d6ac5da76e",
   "metadata": {},
   "source": [
    "To keep this light and tool-independent, we manually craft our query template based on the [Llama 2 instructions from Hugging Face](https://huggingface.co/blog/llama2)."
   ]
  },
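  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "ee35ac56-dc6f-4256-86ea-95ae1089da64",
   "metadata": {},
   "outputs": [],
   "source": [
    "# number of top-ranked retrieved paragraphs to pass to the LLM as context\n",
    "context_items_from_retrieval: int = 5\n",
    "# join the selected paragraphs, one per line, with a leading and trailing newline\n",
    "context_text: str = (\n",
    "    \"\\n\"\n",
    "    + \"\\n\".join(norm_recency_result_df[\"body\"].iloc[:context_items_from_retrieval])\n",
    "    + \"\\n\"\n",
    ")\n",
    "\n",
    "# Llama 2 chat prompt: the system prompt goes in the <<SYS>> block, the user message before [/INST]\n",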
    "query = f\"\"\"<s>[INST] <<SYS>>\n",
    "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n",
    "\n",
    "If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.\n",
    "<</SYS>>\n",
    "\n",
    "Please answer the following question by using information from the provided context information!\n",
    "CONTEXT_INFORMATION: {context_text}\n",
    "QUESTION: {initial_query_text}\n",
    "[/INST]\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "7ce8d635-2e93-4f86-8042-67ef12e1dc2a",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<s>[INST] <<SYS>>\n",
      "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n",
      "\n",
      "If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.\n",
      "<</SYS>>\n",
      "\n",
      "Please answer the following question by using information from the provided context information!\n",
      "CONTEXT_INFORMATION: \n",
      "Management shall be responsible for ensuring that information security policies and procedures are reviewed annually, distributed and available, and that employees and contractors abide by those policies and procedures for the duration of their employment or engagement. Annual policy review shall include a review of any linked or referenced procedures, standards or guidelines.\n",
      "Management shall ensure that information security responsibilities are communicated to individuals, through written job descriptions, policies or some other documented method which is accurately updated and maintained. Compliance with information security policies and procedures and fulfillment of information security responsibilities shall be evaluated as part of the performance review process wherever applicable.\n",
      "Management shall consider excessive pressures, and opportunities for fraud when establishing incentives and segregating roles, responsibilities, and authorities.\n",
      "To ensure that employees and contractors meet security requirements, understand their responsibilities, and are suitable for their roles.\n",
      "All <Company Name> employees will undergo an annual performance review which will include an assessment of job performance, competence in the role, adherence to company policies and code of conduct, and achievement of role-specific objectives.\n",
      "\n",
      "QUESTION: What should management monitor?\n",
      "[/INST]\n"
     ]
    }
   ],
   "source": [
    "print(query)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d700545e-45ec-43c7-b222-9e69e6f7d9ab",
   "metadata": {},
   "source": [
    "## Generation - prompt the LLM for text generation to answer our query"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "6da4d2b1-af52-4d2b-9b73-70ec6cfe6a9b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "da79808f3695402d88ea7056031d3230",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import transformers\n",
    "\n",
    "# load the 7B-parameter Llama 2 chat model from Hugging Face - the first run downloads the model, subsequent runs load it from the local cache\n",
    "model = \"meta-llama/Llama-2-7b-chat-hf\"\n",
    "\n",
    "# we use the plain transformers library without popular wrappers - but LlamaIndex and LangChain can easily be plugged in here as those use transformers under the hood too\n",
    "tokenizer = AutoTokenizer.from_pretrained(model)\n",
    "# call transformers.pipeline explicitly: the result is bound to the name `pipeline`, so a bare pipeline(...) call would fail if this cell is re-run\n",
    "pipeline = transformers.pipeline(\n",
    "    \"text-generation\",\n",
    "    model=model,\n",
    "    torch_dtype=torch.float16,\n",
    "    device_map=\"auto\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "e6e8adf0-098c-4269-9292-06efff7cf40e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Result: <s>[INST] <<SYS>>\n",
      "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n",
      "\n",
      "If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.\n",
      "<</SYS>>\n",
      "\n",
      "Please answer the following question by using information from the provided context information!\n",
      "CONTEXT_INFORMATION: \n",
      "Management shall be responsible for ensuring that information security policies and procedures are reviewed annually, distributed and available, and that employees and contractors abide by those policies and procedures for the duration of their employment or engagement. Annual policy review shall include a review of any linked or referenced procedures, standards or guidelines.\n",
      "Management shall ensure that information security responsibilities are communicated to individuals, through written job descriptions, policies or some other documented method which is accurately updated and maintained. Compliance with information security policies and procedures and fulfillment of information security responsibilities shall be evaluated as part of the performance review process wherever applicable.\n",
      "Management shall consider excessive pressures, and opportunities for fraud when establishing incentives and segregating roles, responsibilities, and authorities.\n",
      "To ensure that employees and contractors meet security requirements, understand their responsibilities, and are suitable for their roles.\n",
      "All <Company Name> employees will undergo an annual performance review which will include an assessment of job performance, competence in the role, adherence to company policies and code of conduct, and achievement of role-specific objectives.\n",
      "\n",
      "QUESTION: What should management monitor?\n",
      "[/INST]  Based on the provided context information, management should monitor the following:\n",
      "\n",
      "1. Information security policies and procedures: Management should review and update these policies and procedures annually to ensure they are relevant and effective in protecting the company's information assets.\n",
      "2. Employee and contractor compliance: Management should ensure that employees and contractors are aware of and comply with the company's information security policies and procedures throughout their employment or engagement.\n",
      "3. Performance reviews: Management should include an assessment of employee and contractor compliance with information security policies and procedures as part of their annual performance reviews.\n",
      "4. Job descriptions and responsibilities: Management should ensure that job descriptions accurately reflect the information security responsibilities of each role and are updated as necessary.\n",
      "5. Incentives and segregation of roles: Management should consider excessive pressures and opportunities for fraud when establishing incentives and segregating roles, responsibilities, and authorities to ensure that employees and contractors are suitable for their roles.\n",
      "6. Employee competence: Management should assess employee competence in their roles to ensure that they have the necessary skills and knowledge to perform their duties effectively and securely.\n",
      "7. Role-specific objectives: Management should establish role-specific objectives for employees and contractors to ensure that they are working towards common goals and are aligned with the company's overall information security strategy.\n"
     ]
    }
   ],
   "source": [
    "# prompt the LLM with our RAG query; sampling is restricted to the 10 most likely tokens at each step (top_k=10)\n",
    "sequences = pipeline(\n",
    "    query,\n",
    "    do_sample=True,\n",
    "    top_k=10,\n",
    "    num_return_sequences=1,\n",
    "    eos_token_id=tokenizer.eos_token_id,\n",
    "    pad_token_id=tokenizer.eos_token_id,\n",
    ")\n",
    "# generated_text holds the prompt followed by the model's answer\n",
    "for seq in sequences:\n",
    "    print(f\"Result: {seq['generated_text']}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eba6a539-144c-46f5-bee8-88e64f34ae87",
   "metadata": {},
   "source": [
    "## Evaluation of the answer"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9aaf7e68-3847-4336-a420-e429ee470743",
   "metadata": {},
   "source": [
    "The answer is structured, contains the information present in the context documents, and is tailored to our question. Depending on the hardware, generation runs in a manageable time.\n",
    "In point 3, we can see that the generated text contains the correct annual review term. This is mainly because the retrieval step supplied the right paragraphs as context information."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "superlinked-py3.10",
   "language": "python",
   "name": "superlinked-py3.10"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
